259 research outputs found

    Support Efficient, Scalable, and Online Social Spam Detection in System

    The broad success of online social networks (OSNs) has created fertile soil for the emergence and fast spread of social spam. Fake news, malicious URL links, fraudulent advertisements, fake reviews, and biased propaganda are bringing serious consequences for both virtual social networks and human life in the real world. Effectively detecting social spam is a hot topic in both academia and industry. However, traditional social spam detection techniques are limited to centralized processing on top of one specific data source and ignore the social spam correlations across distributed data sources. Moreover, a few research efforts have integrated stream systems (e.g., Storm, Spark) with large-scale social spam detection, but they typically ignore the specific details of managing and recovering interim states during social stream data processing. We observed that social spammers who aim to advertise their products or post victim links spread malicious posts more frequently within a very short period of time. They are also smart enough to adapt to old models that were trained on historical records. This raises a question: how can we uncover and defend against these spam activities in an online and scalable manner? In this dissertation, we present three systems that support scalable and online social spam detection from streaming social data: (1) the first part introduces Oases, a scalable system that can support large-scale online social spam detection; (2) the second part introduces SpamHunter, a novel system that supports efficient, scalable online spam detection in social networks and gives novel insights into guaranteeing the efficiency of modern stream applications by leveraging spam correlations at scale; and (3) the third part addresses state recovery during social spam detection, introducing a customizable state recovery framework that provides fast and scalable state recovery mechanisms for protecting large distributed states in social spam detection applications.

    Sentiment Analysis of Long-term Social Data during the COVID-19 Pandemic

    The COVID-19 pandemic has brought an “infodemic” to the social media world. Various social platforms play a significant role in instantly acquiring the latest updates on the pandemic. Social media such as Twitter and Facebook produce vast amounts of posts related to the virus, vaccines, economics, and politics. In order to figure out how public opinion and sentiments are expressed during the pandemic, this work analyzes long-term social posts from social media and conducts sentiment analysis on tweets posted within 12 months. Our findings show the trending topics of long-term social communities during the pandemic and express people’s attitudes towards the progress of major actions during the pandemic. We explore the main topics during the prolonged pandemic, including information surrounding economics, vaccines, and politics. Besides, we show the differences in gender-based attitudes and propose future research questions related to the “infodemic”. We believe that our work contributes to attracting public attention to the “infodemic” of the social crisis.

    Making Machine Learning Datasets and Models FAIR for HPC: A Methodology and Case Study

    The FAIR Guiding Principles aim to improve the findability, accessibility, interoperability, and reusability of digital content by making it both human and machine actionable. However, these principles have not yet been broadly adopted in the domain of machine learning-based program analyses and optimizations for High-Performance Computing (HPC). In this paper, we design a methodology to make HPC datasets and machine learning models FAIR after investigating existing FAIRness assessment and improvement techniques. Our methodology includes a comprehensive, quantitative assessment of selected data, followed by concrete, actionable suggestions to improve FAIRness with respect to common issues related to persistent identifiers, rich metadata descriptions, and license and provenance information. Moreover, we select a representative training dataset to evaluate our methodology. The experiment shows that the methodology can effectively improve the dataset and model's FAIRness from an initial score of 19.1% to a final score of 83.0%.

    Duality Cascade in Brane Inflation

    We show that brane inflation is very sensitive to tiny sharp features in extra dimensions, including those in the potential and in the warp factor. This can show up as observational signatures in the power spectrum and/or non-Gaussianities of the cosmic microwave background radiation (CMBR). One general example of such sharp features is a succession of small steps in a warped throat, caused by Seiberg duality cascade using gauge/gravity duality. We study the cosmological observational consequences of these steps in brane inflation. Since the steps come in a series, the prediction of other steps and their properties can be tested by future data and analysis. It is also possible that the steps are too close to be resolved in the power spectrum, in which case they may show up only in the non-Gaussianity of the CMB temperature fluctuations and/or EE polarization. We study two cases. In the slow-roll scenario where steps appear in the inflaton potential, the sensitivity of brane inflation to the height and width of the steps is increased by several orders of magnitude compared to that in previously studied large field models. In the IR DBI scenario where steps appear in the warp factor, we find that the glitches in the power spectrum caused by these sharp features are generally small or even unobservable, but the associated distinctive non-Gaussianity can be large. Together with its large negative running of the power spectrum index, this scenario clearly illustrates how rich and different a brane inflationary scenario can be when compared to generic slow-roll inflation. Such distinctive stringy features may provide a powerful probe of superstring theory. Comment: Corrections in Eq. (5.47), Eq. (5.48), Eq. (5.49) and Fig

    Comparing Brane Inflation to WMAP

    We compare the simplest realistic brane inflationary model to recent cosmological data, including WMAP 3-year cosmic microwave background (CMB) results, Sloan Digital Sky Survey luminous red galaxies (SDSS LRG) power spectrum data and Supernovae Legacy Survey (SNLS) Type Ia supernovae distance measures. Here, the inflaton is simply the position of a $D3$-brane which is moving towards a $\bar{D}3$-brane sitting at the bottom of a throat (a warped, deformed conifold) in the flux compactified bulk in Type IIB string theory. The analysis includes both the usual slow-roll scenario and the Dirac-Born-Infeld scenario of slow but relativistic rolling. Requiring that the throat is inside the bulk greatly restricts the allowed parameter space. We discuss possible scenarios in which large tensor mode and/or non-Gaussianity may emerge. Here, the properties of a large tensor mode deviate from those in the usual slow-roll scenario, providing a possible stringy signature. Overall, within the brane inflationary scenario, the cosmological data is providing information about the properties of the compactification of the extra dimensions. Comment: 45 pages 11 figure

    High-density molecular characterization and association mapping in Ethiopian durum wheat landraces reveals high diversity and potential for wheat breeding

    Durum wheat (Triticum turgidum subsp. durum) is a key crop worldwide, yet its improvement and adaptation to emerging environmental threats are made difficult by the limited amount of allelic variation included in its elite pool. New allelic diversity may provide novel loci to international crop breeding through quantitative trait loci (QTL) mapping in unexplored material. Here we report the extensive molecular and phenotypic characterization of hundreds of Ethiopian durum wheat landraces and several Ethiopian improved lines. We test 81,587 markers scoring 30,155 single nucleotide polymorphisms and use them to survey the diversity, structure, and genome-specific variation in the panel. We show the uniqueness of Ethiopian germplasm using a siding collection of Mediterranean durum wheat accessions. We phenotype the Ethiopian panel for ten agronomic traits in two highly diversified Ethiopian environments for two consecutive years, and use this information to conduct a genome-wide association study. We identify several loci underpinning agronomic traits of interest, both confirming loci already reported and describing new promising genomic regions. These loci may be efficiently targeted with molecular markers already available to conduct marker-assisted selection in Ethiopian and international wheat. We show that Ethiopian durum wheat represents an important and mostly unexplored source of durum wheat diversity. The panel analyzed in this study allows the accumulation of QTL mapping experiments, providing the initial step for a quantitative, methodical exploitation of untapped diversity in producing a better wheat.

    Global, regional, and national comparative risk assessment of 79 behavioural, environmental and occupational, and metabolic risks or clusters of risks, 1990-2015: a systematic analysis for the Global Burden of Disease Study 2015

    Summary. Background: The Global Burden of Diseases, Injuries, and Risk Factors Study 2015 provides an up-to-date synthesis of the evidence for risk factor exposure and the attributable burden of disease. By providing national and subnational assessments spanning the past 25 years, this study can inform debates on the importance of addressing risks in context. Methods: We used the comparative risk assessment framework developed for previous iterations of the Global Burden of Disease Study to estimate attributable deaths, disability-adjusted life-years (DALYs), and trends in exposure by age group, sex, year, and geography for 79 behavioural, environmental and occupational, and metabolic risks or clusters of risks from 1990 to 2015. This study included 388 risk-outcome pairs that met World Cancer Research Fund-defined criteria for convincing or probable evidence. We extracted relative risk and exposure estimates from randomised controlled trials, cohorts, pooled cohorts, household surveys, census data, satellite data, and other sources. We used statistical models to pool data, adjust for bias, and incorporate covariates. We developed a metric that allows comparisons of exposure across risk factors: the summary exposure value. Using the counterfactual scenario of theoretical minimum risk level, we estimated the portion of deaths and DALYs that could be attributed to a given risk. We decomposed trends in attributable burden into contributions from population growth, population age structure, risk exposure, and risk-deleted cause-specific DALY rates. We characterised risk exposure in relation to a Socio-demographic Index (SDI). Findings: Between 1990 and 2015, global exposure to unsafe sanitation, household air pollution, childhood underweight, childhood stunting, and smoking each decreased by more than 25%. Global exposure for several occupational risks, high body-mass index (BMI), and drug use increased by more than 25% over the same period. All risks jointly evaluated in 2015 accounted for 57·8% (95% CI 56·6–58·8) of global deaths and 41·2% (39·8–42·8) of DALYs. In 2015, the ten largest contributors to global DALYs among Level 3 risks were high systolic blood pressure (211·8 million [192·7 million to 231·1 million] global DALYs), smoking (148·6 million [134·2 million to 163·1 million]), high fasting plasma glucose (143·1 million [125·1 million to 163·5 million]), high BMI (120·1 million [83·8 million to 158·4 million]), childhood undernutrition (113·3 million [103·9 million to 123·4 million]), ambient particulate matter (103·1 million [90·8 million to 115·1 million]), high total cholesterol (88·7 million [74·6 million to 105·7 million]), household air pollution (85·6 million [66·7 million to 106·1 million]), alcohol use (85·0 million [77·2 million to 93·0 million]), and diets high in sodium (83·0 million [49·3 million to 127·5 million]). From 1990 to 2015, attributable DALYs declined for micronutrient deficiencies, childhood undernutrition, unsafe sanitation and water, and household air pollution; reductions in risk-deleted DALY rates rather than reductions in exposure drove these declines. Rising exposure contributed to notable increases in attributable DALYs from high BMI, high fasting plasma glucose, occupational carcinogens, and drug use. Environmental risks and childhood undernutrition declined steadily with SDI; low physical activity, high BMI, and high fasting plasma glucose increased with SDI. In 119 countries, metabolic risks, such as high BMI and fasting plasma glucose, contributed the most attributable DALYs in 2015. Regionally, smoking still ranked among the leading five risk factors for attributable DALYs in 109 countries; childhood underweight and unsafe sex remained primary drivers of early death and disability in much of sub-Saharan Africa. Interpretation: Declines in some key environmental risks have contributed to declines in critical infectious diseases. Some risks appear to be invariant to SDI. Increasing risks, including high BMI, high fasting plasma glucose, drug use, and some occupational exposures, contribute to rising burden from some conditions, but also provide opportunities for intervention. Some highly preventable risks, such as smoking, remain major causes of attributable DALYs, even as exposure is declining. Public policy makers need to pay attention to the risks that are increasingly major contributors to global burden. Funding: Bill & Melinda Gates Foundation.

    Generation and Characterization of Large Non-Gaussianities in Single Field Inflation

    Inflation driven by a single, minimally coupled, slowly rolling field generically yields a negligible primordial non-Gaussianity. We discuss two distinct mechanisms by which a non-trivial potential can generate large non-Gaussianities. Firstly, if the inflaton traverses a feature in the potential, or if the inflationary phase is short enough so that initial transient contributions to the background dynamics have not been erased, modes near horizon-crossing can acquire significant non-Gaussianities. Secondly, potentials with small-scale structure may induce significant non-Gaussianities while the relevant modes are deep inside the horizon. The first case includes the "step" potential we previously analyzed, while the second "resonance" case is novel. We derive analytic approximations for the 3-point terms generated by both mechanisms written as products of functions of the three individual momenta, permitting the use of efficient analysis algorithms. Finally, we present a significantly improved approach to regularizing and numerically evaluating the integrals that contribute to the 3-point function. Comment: 29 pp, 8 fig